AD-DROP: Attribution-Driven Dropout for Robust Language Model Fine-Tuning
Pre-training large language models (PrLMs) on massive unlabeled corpora and fine-tuning them on downstream tasks has become a new paradigm [1-3]. Their success can be partly attributed to the self-attention mechanism [4], yet these self-attention networks are often redundant [5, 6] and tend to cause overfitting when fine-tuned on downstream tasks due to the mismatch between their overparameterization and the limited annotated data [7-13]. To address this issue, various regularization techniques such as data augmentation [14, 15], adversarial training [16, 17], and dropout-based methods [11, 13, 18] have been developed.
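AD-DROP's exact procedure is not spelled out in this excerpt, but the core idea its title names, dropping attention positions according to attribution rather than uniformly at random, can be sketched. The function below is a minimal, hypothetical illustration: given an attribution matrix for one attention head, it masks the highest-attributed fraction of positions in each row, so fine-tuning cannot over-rely on a few dominant connections. The shapes and the per-row top-k policy are assumptions for illustration, not the paper's specification.

```python
import numpy as np

def attribution_dropout_mask(attribution, drop_ratio=0.25):
    """Build a 0/1 mask over attention positions that drops the
    top drop_ratio fraction of positions per row, ranked by
    attribution score (0 = dropped, 1 = kept).

    attribution: (seq_len, seq_len) attribution scores for one head.
    """
    seq_len = attribution.shape[1]
    k = int(seq_len * drop_ratio)  # positions to drop in each row
    mask = np.ones_like(attribution)
    if k > 0:
        # indices of the k highest-attribution positions per row
        top = np.argsort(attribution, axis=1)[:, -k:]
        np.put_along_axis(mask, top, 0.0, axis=1)
    return mask

# Toy attribution scores for a 2x4 attention slice; with
# drop_ratio=0.25, exactly one position is dropped per row.
attr = np.array([[0.1, 0.9, 0.5, 0.2],
                 [0.3, 0.1, 0.8, 0.7]])
mask = attribution_dropout_mask(attr, drop_ratio=0.25)
```

The mask would then be applied multiplicatively to the attention logits or weights during fine-tuning, analogous to standard dropout but attribution-targeted.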
What Triggers my Model? Contrastive Explanations Inform Gender Choices by Translation Models
Hackenbuchner, Janiça, Tezcan, Arda, Daems, Joke
Interpretability can be implemented as a means to understand decisions taken by (black box) models, such as machine translation (MT) or large language models (LLMs). Yet, research in this area has been limited in relation to a manifested problem in these models: gender bias. With this research, we aim to move away from simply measuring bias to exploring its origins. Working with gender-ambiguous natural source data, this study examines which context, in the form of input tokens in the source sentence, influences (or triggers) the translation model's choice of a certain gender inflection in the target language. To analyse this, we use contrastive explanations and compute saliency attribution. We first address the challenge posed by the lack of a scoring threshold and specifically examine different attribution levels of source words on the model's gender decisions in the translation. We compare salient source words with human perceptions of gender and demonstrate a noticeable overlap between human perceptions and model attribution. Additionally, we provide a linguistic analysis of salient words. Our work showcases the relevance of understanding model translation decisions in terms of gender, how these compare to human decisions, and how this information can be leveraged to mitigate gender bias.
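The abstract's recipe, scoring each source token by how much it pushes the model toward one gender inflection over the other, can be illustrated with gradient-x-input saliency on a contrastive score. The sketch below stands in a toy linear scorer for the real translation model (the embeddings, weight vectors, and token names are all invented for illustration); the contrastive score is the masculine logit minus the feminine logit, and for a linear scorer the gradient-x-input saliency has a closed form.

```python
import numpy as np

def contrastive_saliency(embeddings, w_masc, w_fem):
    """Gradient-x-input saliency of each source token for the
    contrastive score s = logit(masc) - logit(fem) under a toy
    linear scorer that sums token embeddings before scoring.

    embeddings: (num_tokens, dim) source-token embeddings.
    Returns one signed saliency per token: positive values push
    the model toward the masculine form, negative toward feminine.
    """
    # For s = (w_masc - w_fem) . sum_t e_t, the gradient w.r.t.
    # each token embedding is (w_masc - w_fem), so gradient-x-input
    # reduces to a dot product per token:
    contrast = w_masc - w_fem
    return embeddings @ contrast

# Invented 2-d embeddings for a gender-ambiguous source sentence.
emb = np.array([[0.1, 0.0],   # "the"
                [0.9, 0.2],   # "nurse"
                [0.0, 1.0],   # "she"
                [0.2, 0.1]])  # "works"
w_masc = np.array([1.0, -1.0])
w_fem = np.array([0.0, 1.0])
sal = contrastive_saliency(emb, w_masc, w_fem)
```

In a real setup the gradient would be taken through the full translation model with autograd; the thresholding question the abstract raises is exactly which magnitude of these scores counts as "salient".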
MACIE: Multi-Agent Causal Intelligence Explainer for Collective Behavior Understanding
As multi-agent reinforcement learning (MARL) systems are deployed in safety-critical applications, understanding why agents make decisions and how they achieve collective behavior is crucial. Existing explainable AI methods struggle in multi-agent settings: they fail to attribute collective outcomes to individuals, quantify emergent behaviors, or capture complex interactions. We present MACIE (Multi-Agent Causal Intelligence Explainer), a framework combining structural causal models, interventional counterfactuals, and Shapley values to provide comprehensive explanations. MACIE addresses three questions: first, each agent's causal contribution, via interventional attribution scores; second, system-level emergent intelligence, via synergy metrics separating collective effects from individual contributions; third, actionable explanations, via natural-language narratives synthesizing causal insights. We evaluate MACIE across four MARL scenarios spanning cooperative, competitive, and mixed-motive settings. Results show accurate outcome attribution (mean phi_i = 5.07, standard deviation < 0.05), detection of positive emergence in cooperative tasks (synergy index up to 0.461), and efficient computation (0.79 seconds per dataset on CPU). MACIE uniquely combines causal rigor, emergence quantification, and multi-agent support while remaining practical for real-time use. This represents a step toward interpretable, trustworthy, and accountable multi-agent AI.
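One of MACIE's ingredients, Shapley values over agents, has a standard exact form for small agent counts: each agent's share of the collective outcome is its marginal contribution averaged over all coalition orderings. The sketch below implements that textbook computation with an invented team-reward function (the synergy bonus is an assumption for illustration, not MACIE's value function); note that the Shapley values sum to the grand coalition's value, which is what makes the attribution exact.

```python
from itertools import combinations
from math import factorial

def shapley_values(agents, value_fn):
    """Exact Shapley value of each agent.

    value_fn maps a frozenset of agents (a coalition) to the
    collective outcome achieved by that coalition.
    """
    n = len(agents)
    phi = {}
    for a in agents:
        others = [x for x in agents if x != a]
        total = 0.0
        for k in range(n):
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                # weight of a size-k coalition in the Shapley average
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value_fn(s | {a}) - value_fn(s))
        phi[a] = total
    return phi

# Invented team reward: each agent adds 1.0, and agents 0 and 1
# produce an extra 0.5 when both are present (emergent synergy).
def team_reward(coalition):
    base = len(coalition) * 1.0
    synergy = 0.5 if {0, 1} <= coalition else 0.0
    return base + synergy

phi = shapley_values([0, 1, 2], team_reward)
```

The exact computation is exponential in the number of agents; practical systems typically switch to Monte Carlo sampling of coalition orderings beyond a handful of agents.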
Interpreting the Effects of Quantization on LLMs
Singh, Manpreet, Sajjad, Hassan
Quantization offers a practical solution for deploying LLMs in resource-constrained environments. However, its impact on internal representations remains understudied, raising questions about the reliability of quantized models. In this study, we employ a range of interpretability techniques to investigate how quantization affects model and neuron behavior. We analyze multiple LLMs under 4-bit and 8-bit quantization. Our findings reveal that the impact of quantization on model calibration is generally minor. Analysis of neuron activations indicates that the number of dead neurons, i.e., those with activation values close to 0 across the dataset, remains consistent regardless of quantization. In terms of neuron contribution to predictions, we observe that smaller full-precision models exhibit fewer salient neurons, whereas larger models tend to have more, with the exception of Llama-2-7B. The effect of quantization on neuron redundancy varies across models. Overall, our findings suggest that the effect of quantization may vary by model and task; however, we did not observe any drastic change that would discourage the use of quantization as a reliable model compression technique.
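The dead-neuron metric the abstract uses, neurons whose activation stays close to zero across the whole dataset, is straightforward to operationalize once activations have been collected. The sketch below assumes activations are already gathered into a (num_examples, num_neurons) array; the threshold `eps` is an assumed tolerance, since the abstract says only "close to 0".

```python
import numpy as np

def count_dead_neurons(activations, eps=1e-6):
    """Count neurons whose activation magnitude stays below eps
    on every example in the dataset.

    activations: array of shape (num_examples, num_neurons).
    """
    # a neuron is alive if it exceeds eps on at least one example
    alive = (np.abs(activations) >= eps).any(axis=0)
    return int((~alive).sum())

# Toy check: neuron 0 fires, neuron 1 is exactly zero everywhere,
# neuron 2 is nonzero but below the tolerance.
acts = np.array([
    [0.8, 0.0, 1e-9],
    [0.1, 0.0, -1e-8],
])
```

Comparing this count between a full-precision model and its 4-bit or 8-bit quantized counterpart, layer by layer, is the kind of consistency check the study reports.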